# CLIP multimodal fusion

Kandinsky 2.2 Decoder

Kandinsky 2.2 is a text-to-image generation model that builds on best practices from DALL-E 2 and latent diffusion models. It uses CLIP as both the text encoder and the image encoder to improve visual expressiveness.

kandinsky-community

Kandinsky 2.1 Inpaint

Kandinsky 2.1 is a text-to-image generation model that builds on best practices from DALL-E 2 and latent diffusion models. It uses CLIP as both the text encoder and the image encoder to improve visual expressiveness.

kandinsky-community


© 2025 AIbase